This notebook keeps track of results as they are generated on the final dataset for the Autoimmunity and Health project. The project aims to quantify the diagnostic journeys of chronically ill folk in Australia. Recruitment of both chronically ill participants and healthy controls will continue until June 2022. The accompanying code to generate the images is in AD-report-code.R.
Overview of participating cohort.
To date, 1366 people have participated in this project by taking the survey: 1183 chronically ill folk and 186 controls, recording 176 illnesses. Their distribution by state of residence is shown in the chart below and is on par with the population density ratio between the states.
Age by gender distribution

Age, gender and cohort distribution
Gender breakdown between chronically ill and control cohorts by percentage:
A visualisation comparing the age ranges of the healthy control (HC) and chronically ill (CI) cohorts shows no statistically significant difference in age representation between the two cohorts.
How about by gender? Here, any identification other than Male and Female was grouped into Other due to the small number of respondents with those identities.
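The notebook does not state which test sits behind this comparison. As a minimal sketch, assuming age is recorded as a number and that a Wilcoxon rank-sum test is an acceptable choice, the comparison could look like the following. The data frame `demo_age` and its values are simulated purely for illustration and are not the project data.

```r
set.seed(42)  # make the simulated example reproducible

# Simulated ages for illustration only (hypothetical cohort sizes and means)
demo_age <- data.frame(
  cohort = rep(c("CI", "HC"), times = c(200, 40)),
  age    = c(round(rnorm(200, mean = 42, sd = 12)),
             round(rnorm(40,  mean = 41, sd = 12)))
)

# Two-sided Wilcoxon rank-sum test of age between the two cohorts
age_test <- wilcox.test(age ~ cohort, data = demo_age)
print(age_test$p.value)
```

If age was collected in bands rather than as a number, a test on the cohort-by-band contingency table would be the closer analogue.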
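A minimal base R sketch of this kind of recoding is shown below. The vector `gender_raw` and its values are hypothetical; the survey's actual variable names and response options are not given in this notebook.

```r
# Hypothetical raw responses, including a missing value
gender_raw <- c("Female", "Male", "Non-binary", "Female", "Agender", "Male", NA)

# Keep Male and Female, collapse every other non-missing response into "Other"
gender_grouped <- ifelse(
  is.na(gender_raw), NA,
  ifelse(gender_raw %in% c("Male", "Female"), gender_raw, "Other")
)
gender_grouped <- factor(gender_grouped, levels = c("Female", "Male", "Other"))

print(table(gender_grouped, useNA = "ifany"))
```

Missing responses are kept as missing here rather than being counted as Other; whether the report does the same is an assumption.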
Ethnicity

Income

Education level

Relationship status

Misdiagnosis

There is a significant association between length of time to diagnosis and misdiagnosis, based on Fisher’s exact test.
##
## Fisher's Exact Test for Count Data with simulated p-value (based on
## 2000 replicates)
##
## data: testx
## p-value = 0.0004998
## alternative hypothesis: two.sided
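Output of this form is what `fisher.test()` prints when called with `simulate.p.value = TRUE` and `B = 2000`. The sketch below shows such a call on a made-up contingency table (`testx_demo` and its counts are hypothetical, not the project's `testx`). One point worth noting when reading the result above: with B Monte Carlo replicates, the smallest p-value the simulation can return is 1/(B + 1), which for B = 2000 is about 0.0004998. The reported value therefore sits at this floor and is best read as p ≤ 0.0005.

```r
# Hypothetical counts: rows = time-to-diagnosis band, columns = misdiagnosed
testx_demo <- matrix(
  c(40, 10,
    30, 25,
    15, 35),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    time_to_dx   = c("<1 year", "1-5 years", ">5 years"),
    misdiagnosed = c("No", "Yes")
  )
)

set.seed(1)  # simulated p-values vary between runs without a seed
B <- 2000
res <- fisher.test(testx_demo, simulate.p.value = TRUE, B = B)

# Smallest p-value the simulation can return with B replicates
p_floor <- 1 / (B + 1)

print(res$p.value)
print(p_floor)
```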
Autoimmune disorders and chronic illnesses represented in the cohort.
Primary illnesses reported.
In this cohort, 180 chronic illnesses have been reported by participants. At the request of the chronic illness community, hypermobility spectrum disorders were grouped together. This group includes Ehlers-Danlos syndrome (EDS), including variants such as hEDS and vEDS, as well as hypermobility spectrum disorder (HSD), because the diagnostic criteria changed in the last three years such that hEDS would now usually be diagnosed as HSD. Vasculitis disorders were also grouped together, as the underlying disease mechanism is vascular inflammation that presents in different tissues or types of blood vessels (i.e. Wegener’s granulomatosis, Takayasu’s arteritis, Susac’s syndrome, giant cell arteritis, essential mixed cryoglobulinemia, eosinophilic granulomatosis with polyangiitis (EGPA), cerebral vasculitis, general vasculitis, leukocytoclastic vasculitis, microscopic polyangiitis, lymphocytic vasculitis, Henoch-Schönlein purpura and cutaneous small vessel vasculitis). Individually, these rare illnesses had too few participants to provide any statistical power.
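The grouping described above can be sketched as a simple lookup from reported illness to group label. The illness names in `group_map` come from the text above, but the map is deliberately partial, and the group labels, the `primary` vector and the ungrouped example ("Coeliac disease") are hypothetical.

```r
# Partial lookup: reported illness -> grouped label (labels are illustrative)
group_map <- c(
  "Ehlers-Danlos syndrome"          = "Hypermobility spectrum disorders",
  "hEDS"                            = "Hypermobility spectrum disorders",
  "vEDS"                            = "Hypermobility spectrum disorders",
  "Hypermobility spectrum disorder" = "Hypermobility spectrum disorders",
  "Giant cell arteritis"            = "Vasculitis",
  "Takayasu's Arteritis"            = "Vasculitis",
  "Microscopic polyangiitis"        = "Vasculitis"
)

# Hypothetical reported primary illnesses
primary <- c("hEDS", "Giant cell arteritis", "Coeliac disease", "vEDS")

# Use the group label where one exists, otherwise keep the illness as reported
grouped <- unname(ifelse(primary %in% names(group_map), group_map[primary], primary))
print(grouped)
```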

Bar charts, as shown below, are another visual representation and give more informative values. For this example, only illnesses with more than 10 entries are selected.
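A minimal sketch of that selection step in base R follows. The `illness` vector is made-up example data, and the threshold of more than 10 entries is taken from the text above.

```r
# Hypothetical reported illnesses: 25, 11 and 4 entries respectively
illness <- rep(c("Illness A", "Illness B", "Illness C"), times = c(25, 11, 4))

# Count entries per illness and keep those with more than 10
counts <- table(illness)
common <- sort(counts[counts > 10], decreasing = TRUE)
print(common)

# Bar chart of the retained illnesses
barplot(common, las = 2, ylab = "Participants")
```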
Comorbidities found among participants.
Comorbidities were also reported in the control cohort. However, I think these do not need to be merged, as they are primarily asthma and mental illness.
UpSet plot of top 10 represented ADs with misdiagnosis rate and employment status

Violin plot of y = number of ADs, x = length to diagnosis
SF36
Symptoms
Overview of symptoms experienced
Symptom distribution between chronically ill and control cohorts
## $`Chronically ill`
##            symptom
## severity            bruise         chills           conc        fatigue
## None           24.6% (312)    46.5% (589)    12.4% (157)      3.5% (44)
## Very mild      13.4% (170)    13.2% (168)    12.5% (158)      3.2% (41)
## Mild           15.0% (190)    13.6% (172)    19.1% (242)     9.9% (126)
## Moderate       22.2% (282)     9.1% (115)    26.2% (332)    29.9% (379)
## Severe         10.7% (136)      3.4% (43)    13.0% (165)    30.4% (385)
## Very severe      2.9% (37)       0.6% (8)      6.0% (76)    13.5% (171)
## <NA>           11.1% (141)    13.6% (173)    10.9% (138)     9.6% (122)
## Total        100.0% (1268)  100.0% (1268)  100.0% (1268)  100.0% (1268)
##
## severity             fever         glands           hair          joint         memory
## None           56.9% (722)    41.8% (530)    38.8% (492)    15.4% (195)    21.6% (274)
## Very mild      10.6% (135)    16.0% (203)    12.9% (163)     9.0% (114)    14.5% (184)
## Mild           10.9% (138)    13.8% (175)    15.3% (194)    14.1% (179)    21.7% (275)
## Moderate         5.9% (75)    11.1% (141)    14.2% (180)    27.9% (354)    19.7% (250)
## Severe           2.1% (26)      3.5% (45)      5.4% (68)    17.0% (216)      7.4% (94)
## Very severe      0.8% (10)       0.7% (9)      1.8% (23)      5.9% (75)      3.2% (41)
## <NA>           12.8% (162)    13.0% (165)    11.7% (148)    10.6% (135)    11.8% (150)
## Total        100.0% (1268)  100.0% (1268)  100.0% (1268)  100.0% (1268)  100.0% (1268)
##
## severity              pain           skin          skin2        stomach           weak
## None           11.7% (148)    21.1% (267)    35.5% (450)    12.1% (153)    13.7% (174)
## Very mild       9.6% (122)    13.2% (168)    11.5% (146)     8.5% (108)    10.6% (134)
## Mild           10.2% (129)    21.3% (270)    14.6% (185)    16.1% (204)    16.5% (209)
## Moderate       26.3% (334)    23.5% (298)    16.0% (203)    29.7% (376)    25.2% (319)
## Severe         21.8% (276)      7.0% (89)      6.7% (85)    16.2% (205)    14.4% (183)
## Very severe    10.0% (127)      2.7% (34)      2.8% (36)      7.3% (93)     8.3% (105)
## <NA>           10.4% (132)    11.2% (142)    12.9% (163)    10.2% (129)    11.4% (144)
## Total        100.0% (1268)  100.0% (1268)  100.0% (1268)  100.0% (1268)  100.0% (1268)
##
## $`Control group`
##            symptom
## severity            bruise         chills           conc        fatigue          fever
## None            48.5% (49)     77.2% (78)     39.6% (40)     24.8% (25)     81.2% (82)
## Very mild       12.9% (13)       4.0% (4)     22.8% (23)     18.8% (19)       1.0% (1)
## Mild            11.9% (12)       3.0% (3)       7.9% (8)     15.8% (16)       3.0% (3)
## Moderate          6.9% (7)       1.0% (1)     11.9% (12)     20.8% (21)       0.0% (0)
## Severe            3.0% (3)       0.0% (0)       3.0% (3)       3.0% (3)       0.0% (0)
## Very severe       1.0% (1)       0.0% (0)       1.0% (1)       2.0% (2)       0.0% (0)
## <NA>            15.8% (16)     14.9% (15)     13.9% (14)     14.9% (15)     14.9% (15)
## Total         100.0% (101)   100.0% (101)   100.0% (101)   100.0% (101)   100.0% (101)
##
## severity            glands           hair          joint         memory           pain           skin
## None            75.2% (76)     63.4% (64)     49.5% (50)     56.4% (57)     48.5% (49)     47.5% (48)
## Very mild         5.9% (6)       8.9% (9)     10.9% (11)     14.9% (15)     14.9% (15)      9.9% (10)
## Mild              4.0% (4)       6.9% (7)     10.9% (11)       7.9% (8)       7.9% (8)     16.8% (17)
## Moderate          0.0% (0)       4.0% (4)     10.9% (11)       5.9% (6)     10.9% (11)     10.9% (11)
## Severe            0.0% (0)       1.0% (1)       3.0% (3)       1.0% (1)       3.0% (3)       1.0% (1)
## Very severe       0.0% (0)       2.0% (2)       1.0% (1)       1.0% (1)       1.0% (1)       2.0% (2)
## <NA>            14.9% (15)     13.9% (14)     13.9% (14)     12.9% (13)     13.9% (14)     11.9% (12)
## Total         100.0% (101)   100.0% (101)   100.0% (101)   100.0% (101)   100.0% (101)   100.0% (101)
##
## severity             skin2        stomach           weak
## None            61.4% (62)     39.6% (40)     56.4% (57)
## Very mild         6.9% (7)     11.9% (12)     16.8% (17)
## Mild              7.9% (8)     11.9% (12)       5.0% (5)
## Moderate          5.9% (6)     15.8% (16)       4.0% (4)
## Severe            1.0% (1)       5.0% (5)       2.0% (2)
## Very severe       1.0% (1)       2.0% (2)       1.0% (1)
## <NA>            15.8% (16)     13.9% (14)     14.9% (15)
## Total         100.0% (101)   100.0% (101)   100.0% (101)
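Each cell in the tables above is a column percentage followed by the count in brackets, with missing responses kept as their own row. As a minimal base R sketch of how one such column could be computed per cohort (not necessarily how the notebook produced it), see below. The data frame `demo_sym`, the helper `pct_n()` and all values are hypothetical.

```r
sev_levels <- c("None", "Very mild", "Mild", "Moderate", "Severe", "Very severe")

# Hypothetical responses for a single symptom
demo_sym <- data.frame(
  cohort  = c(rep("Chronically ill", 6), rep("Control group", 4)),
  fatigue = factor(
    c("None", "Mild", "Severe", "Severe", "Moderate", NA,
      "None", "None", "Mild", NA),
    levels = sev_levels
  )
)

# Format one symptom column as "percent (count)", keeping NA as its own row
pct_n <- function(x) {
  counts <- table(x, useNA = "always")
  pct    <- 100 * as.numeric(counts) / sum(counts)
  out    <- sprintf("%.1f%% (%d)", pct, as.integer(counts))
  names(out) <- ifelse(is.na(names(counts)), "<NA>", names(counts))
  out
}

# One formatted column per cohort
by_cohort <- lapply(split(demo_sym$fatigue, demo_sym$cohort), pct_n)
print(by_cohort)
```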

Heatmaps and networks of symptoms
Overall symptoms

Gender difference in symptoms
Cohort difference in symptoms
Differential network control and chronic cohorts

Misdiagnosis and diagnosis
Differential network between correct and misdiagnosis
